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Abstract. Magnetoencephalographic (MEG) measurements record magnetic fields generated from 
neurons while information is being processed in the brain. The inverse problem of identifying 
sources of biomagnetic fields and deducing their intensities from MEG measurements is ill-posed 
when the number of field detectors is far less than the number of sources. This problem is less 
severe if there is already a reasonable prior knowledge in the form of a distribution in the intensity 
of source activation. In this case the problem of identifying and deducing source intensities may be 
transformed to one of using the MEG data to update a prior distribution to a posterior distribution. 
Here we report on some work done using the maximum entropy method (ME) as an updating tool. 
Specifically, we propose an implementation of the ME method in cases when the prior contain 
almost no knowledge of source activation. Two examples are studied, in which part of motor cortex 
is activated with uniform and varying intensities, respectively. 



INTRODUCTION 

Magnetoencephalographic (MEG) measurements record magnetic fields generated from 
small currents in the neural system while information is being processed in the brain 
[1]. In the classical cortical distributed model, the activation of neurons in the cortex is 
represented by sources of currents whose distribution approximates the cortex structure, 
and MEG measurements provide information on the current distribution for a specific 
brain function [1]. In practice, given an set of current sources J^, = {\, . . . ,N} and a 
set of magnetic field detectors labeled by I = {1, . . . , J}, the relation between the field 
strengths M/ measure by the detectors and the sources can be expressed as 



LiAf7/3+Z; / = !,..., J, /3 = 1,...,M (1) 



where A is a J x matrix whose elements Af are known functions of the geometric 
properties of the sources and the detectors, as determined by the Biot-Savart law [2], 
and X indicates noise, to be ignored here. The detail form of A applicable to the present 
study is given in [3]. In tensor analysis notation Eq. (1) may be simply expressed as 
M = A ■ J. In what follows, we adopt the convention of summing over repeated index (/3 
in Eq. (1)). In standard MEG Eq. (1) appears as an inverse problem: the measured field 
strengths M are given and the unknowns are J. Since the total number d of detectors that 
can be deployed in a practical MEG measurement is far less than the number of current 
sources, the answer to Eq. (1) is not unique and the inverse problem is ill-posed [6, 7]. 



A number of methods have been proposed to solve Eq. (1), including the least- 
square norm [1], the Bayesian approach [4, 5], and the maximum entropy approach 
[6, 7, 8, 9, 10, 1 1]. In the method of maximum entropy (ME) the MEG data, in the form 
of the constraints M - A-J = 0, is used to obtain a posterior probability distribution 
for neuron current intensities from a given prior (distribution). In [11], the method is 
implemented by introducing a hidden variable denoting the grouping property of firing 
neurons. Here we develop an approach such that ME becomes a tool for updating the 
probability distribution [12, 13, 14, 15]. 



COMPUTATIONAL DETAIL 

ME updating procedure. Let the set r to be current intensities r^ caused by neuron 
activities in the cortex at sites j8 = 1,...,A^, and pp{rp) be the probability current 
intensity distribution at site j8 . Assuming the N current sources to be uncorrelated, we 
define the joint probability distribution as P(r) = 0^=1 Ppi^p)- The current at site j8 is 
then/^ = (rp) = J rpP(r)drp. Suppose we have prior knowledge about neuron activities 
expressed in terms of the joint prior M(r) = 0^=1 ^p i^p)- The implication is that would 
produce currents J that does not satisfy Eq. (1) (here without noise). Our goal is to 
update from this prior to a posterior P{r)dr that does satisfy Eq. (1). The ME method 
states that given M(r) and the MEG data, the preferred posterior P{r)dr is the one that 
maximizes relative entropy S[P, u], 

S[P, ii] = -JdrP (r) In {P (r) /u (r)) (2) 

subject to constraints Eq. (1). Here P(r) is given by the variational method, 

P(r)(ir = Z^^M(r)exp(— AAr)(ir, Z = J dr u{r) Qxip{—XAr) e^^ , (3) 

where A is a row vector of length d whose fi^ element A' is the Lagrangian multiplier 
that enforces the l^^ constraints in Eq. (1), AAr = X^A^{rp) and the last equality in 
Eq. (3)defines the quantity F. Because A is known, P(r) is determined by M(r) and the 
A's. The elements of A are the solutions A' = A' in 



_ =Mi, 1=1,.. .,d. (4) 



This is the primal-dual attainment equation derived in [10]. Because Eq. (4) is a non- 
linear equation in the A's, the search for the A's is non-trivial. 

A standard approach is by iteration, specifically by successive steps of updating P{r). 

To demonstrate this we simplify notation and write = A'A^, or simply v = A A. 

Expectation value of currents can then be calculated through (rp) = —dlnZ/dv^. The 

updating process may now start with P^ = u{r) and a set A[o] and proceed with 

{r)dr = Zj^iP[,._i](r)e-^Vr, Z[,.] = J P[i_^{r)e-^'dr, (5) 



where [z = 1,2, ■ ■ ■] denotes the i updating step and v = A[;_i]A. At each step ^[i-i] 
is updated to Aj,] according to Eq. (4). Formally the updating process converges at the 

solution of Eq. (4), which is a fix-point A* of Eq. (5): Aj,] = A = A*. Then the current 
intensities will be fix-points (r^)* such that M = = A(r)*. In practice the fix-point 
may not be reached with infinite accuracy within finite time, and the updating may be 
terminated when the quantity 

5„„/= -10 ln(||M,-M||Vl|Mf) (6) 

attains a predetermined value. It is important to stress that unless the prior ^(r) properly 
reflects sufficient knowledge about neuron activities, there is no guarantee that the fix- 
point (r)* is closely related to the actual current intensities. 




FIGURE 1. A (left panel): Distribution of current sources used in study. The motor cortex is represented 
by the patches 7 to 10. B (right panel): Distribution of magnetic field detectors on a hemisphere 2 cm from 
the scalp. 

Sources with Gaussian distributed intensities. Pertinent general information on the 
geometric structure of the cortex and neuron activities, readily obtained from experi- 
ments such as functional magnetic resonance imaging (fMRI), positron emission tomog- 
raphy (PET), etc., is incorporated in a distributed model [4] in which current sources, 
modeled by magnetic dipoles, are distributed in regions below the scalp. A schematic 
coarse-grained representation of this model is shown in Fig.lA, where 1024 dipoles are 
placed on 16 planar patches, 64 dipole to a patch. Eight of the which are parallel to the 
scalp and the other eight normal. Regardless of the orientation of the patch, all dipoles 
are normal to the cortical surface with the positive direction pointing away from the 
cortex. 

Information contained in a prior may be qualitative instead of quantitative. Here, our 
prior will include the information that the activation resides in a part of the motor cortex 
that in Fig. 1 A is represented by the patches 8 and 9, and utilize this prior information by 
placing a higher concentration of field detectors in the area nearest to those in Fig. IB. 
There is additional information such as neuronal grouping property. We follow Amblard 
et al. [11] and group dipoles into cortical regions Q, k= 1,2,...^, each containing 
dipoles with the n^t's satisfying = Ef=i ^k- Associated with Q is a hidden variable Sk 
that expresses regional activation status: 8^=1 denotes an "excitatory state", or a state of 



out-going current; Sj^=-l denotes an "inhibitory state" (in-going current); 5,^=0 denotes 
a "silent state" (no current). With this grouping, the prior M(r) reduces to a sum of 
probability distributions M(r,5) over all possible configurations of 5 = {Si, ■ • -Sk}'. 



u{r)dr = J^^M(r,S)Jr = J^^M(r|S);r(S)tir = H^i Y.s,^(''k\SkMSk)dr, (7) 



where = {^tj € Q} specifies the current densities of the sources in Q; ju(rjt|5^) = 
Y[r]eCi,l^i^ri\Sk) is the conditional joint probability of the dipoles in Q being in state 
5yt and having current densities r^^; ^{S^) is the probability of the region Q being in 
activation state S^. For fi{ri^\Sk) we adopt a Gaussian distribution for activated states 
[6,7, 11]: 



n — ^ 



exp 



(8) 



For simplicity, all current distributions have the same standard deviation a. Current 
sources at different sites have different mean intensities P;[o] whose signs also indicate 

the state of activation: positive for excitation and negative for inhibition. For silent states 
we let^i{rk\Sk = 0) = d{rk) = Ur^ec^irr^)- We also write 7t{Sk ^ 0) = 1 - 7t{Sk = 0) = 
Uk, where < a^t < 1. We thus have, 

"G(r) = nti Ls^^i^'klSkMSk) = [(1 - + (^k^i{rk\Sk ^ 0)] , (9) 

where the subscript denote Gaussian distribution. This form simplifies the computation 
significantly. At the fi^ iteration we have: 



4l= Ilk=i[nj=i^k[j]) [^-^k[i-i] + ak[i-i]]^j^,^w{FcM)\, (10) 

{fri)[i\ = ^k[i\Pri[i\i Pt)[(] = Pr)[i-1] " <^'^ri[i-l]i ^ Sk), (11) 

0Ck[i\ = ockli-i] ^"M'-i] + (1 ~ %-!]) Rexp j , (12) 

where ^^[o] = Uk and oCkm = a^[i]; = E^eQ(2(7)-i (Pn[«l)^ " {Pvli-nY 

In the absence of any other prior information we take a to be a random number 
(between zero and one), |pj3[o]| to have a random value up to 20 nA, the maximum 
current intensity that can be generated in the brain, and a to be the mean of |Pj3[o]|- 
However, the inverse problem being ill-posed, and since the prior contains no activation 
information, the above strategy produces poor results as expected. 

Better priors by coarse graining. In the absence of prior information on activation 
pattern, one way to acquire some "prior" information from the MEG data itself is 
by coarse graining the current source. Coarse graining reduces the severity of the ill- 
posedness because the closer the number of current sources to the number of detectors, 
the less ill-posed the inverse problem. Within the framework of the ME procedure 
described above, coarse graining can be simply achieved by setting p,j for all 77 in a 



given region Q to be the same. Here we choose to take an intermediate step that disturbs 
the standard ME procedure even less, by replacing the second relation in Eq. (1 1) by 



Ptjh = - (13) 

Note that in Eq. (11) (r,|) — UkP-q depends on the probability common to region Q, 
whereas Pr] does not explicitly. By replacing Prj[i i] by {r-q)^i_i\ on the right hand side 
of Eq. (13), we force the updated in each iteration to be more similar (although not 
necessarily identical). In practice we only use this modified ME to get information on the 
activity pattern, rather than the intensity, of the sources. Let (r)c be the current intensity 

set obtained after a convergence criterion set by requiring > B]nse (Eq. (6)). We now 
define a better prior set of Gaussian means Pc, where, in units of nA, 

_ r sign((r^)c) 20, |(r^)c|>2, 
^^c-|o, |(r^)c|<2. (14) 

These quantities, together with the obtained probabilities CUkc for the regions Q, define 
a prior probability P^'^^{y), which may then be fed into the standard ME procedure for 
computing (r). 

This procedure may be repeated by requiring Bmse to be not less than a succession 

of threshold values, Bm}) < Bms) < Bmse < ■ ■, such that a successive level of better 
priors pd, pel, Pc3, ■ ■ and P^^^^ (r) , P^^^^ (r) , P^^'^^ (r) , . . ., may be obtained. Eventually 
a point of diminishing return is reached. In this work we find the second level prior is 
qualitatively better than the first, and the third is not significantly better than the second. 



RESULTS 

In the following two examples, the 1024 current courses are partitioned into 16 patches, 
eight (4 cm wide and 3.3 cm long) parallel and eight (4 cm wide and 2.3 cm deep) 
normal to the scalp (Fig.lA). On each patch lies a 8x8 rectangular array of sources that 
are divided into 16 four-source groups; that is, A'=256. The interstitial distances on the 
horizontal (vertical) patches are 0.57 and 0.47 cm (0.57 and 0.33 cm), respectively. The 
distance between the adjacent vertical patches are normally 0.55 cm, but the distance 
^89 will be varied for testing, see below. The detectors are arranged in a hemisphere 
surrounding the scalp as indicated in Fig. IB. The matrix A of Eq. (1) is given in [3]. 
The ME procedure is insensitive to a in the range 5< C7 <100. In the coarse graining 

procedure we set 5^j/=100 and 5A,j/=150. As noted previously, coarse graining a 
third time did not produce meaningful improvement on the prior. In the two examples, 
artificial MEG data are generated by having the sources on patch have uniform and 
varied current intensities, respectively. 

Uniform activation on patch 8. In this case the "actual" activity pattern is: the 64 

sources on patch 8 each has a current of 10 nA, and all other sources are inactive 
(Fig.2A.). With the distance dgg set to be 0.55 cm, the results in the first and second 
rounds of searching for a better prior, and in the final ME procedure proper are shown 



in Fig.2B. In the plots, B^se is the defined in Eq. (6) and mse is defined as 

m.e = -101n(||(r)[,]-(r)||VlKr)f), (15) 
where (r) represents the actual source current intensity and the index / indicates the 




FIGURE 2. A (left panel): Contour plot indicates all 64 current sources on patch 8 are activated with 
an intensity of 10 nA. Gray scale on the right shows intensity level in units of 10 nA. Artificial MEG data 
M used in this section are generated through Eq. (1). B (right panel): B^se (top panel) and mse (bottom 
panel) vs. iteration number. 

iteration number. The solid triangles, squares, and crosses, respectively, give results from 
ME iteration procedures for constructing the first prior, second prior, and posterior. It is 
seen that Bmse rises rapidly in the search for the first prior (solid triangles); four iterations 
were needed for B^se to reach 100. B^se is less than 100 at the beginning of the second 
prior search because the prior values for this search is not the posterior of the previous 
search, but is related to it by Eq. (14). The same goes with the the relation between 
the beginning of the ME proper (crosses) and the end of second prior search (squares). 
In the search for the second prior, 5^^^. increases slowly after the seventh iteration, but 
eventually reaches 150 at the 12* iteration. This already suggests that a round of search 
for a still better prior will not be profitable. In the ME procedure proper, Bmse reaches 
150 quickly at the fourth iteration, followed by a slow rise. After reaching 190 at the 
14* iteration the rise is very slow; the final value at the 26* iteration is 195. 

The dependence of the mse value on the ME procedures and the iteration numbers 
essentially mirrors that of B^se- The mse for the final posterior is 68, which corresponds 
to an average of 3.3% error on the current intensities. 

Resolving power as a function of dgg. We tested the resolving power of our ME 
procedure as a function of d^g. With uniform activation on patch 8, the computed mse 
values versus d^g are plotted in Fig. 3 A. The general trend is that mse decreases with 
decreasing dgg as expected: mse = 60 ± 10 when dgg >0.044 cm; mse drops sharply 
when d^g is less than 0.04 cm; mse is less than 8 when d^g is less than 0.0044 cm. In the 
last instance the ME procedure loses its resolving power because the error on the current 
intensity is about 70%. On the other hand, mse = 60 ± 10 implies an error of 5.6±2.6%. 
This means that if an error of no more than 8% is acceptable, the ME method should be 
applicable to a source array whose density is up to one hundred times higher than that 
used in the present study. 
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FIGURE 3. A (left panel): mse value vs. the separation ds9 between patches 8 and 9. The distances 
c/89=0.55, 0.0275 and 0.0035 cm are marked out and labeled (1), (2) and (3), respectively. B (right panel): 
Reconstructed (r) vs. source number for the cases (1) (solid line) and case (2) (dashed line) in A. 



Resolving power as a function of depth. Signals from sources deeper in the cortex 
are in general weaker at the detectors and are harder to resolve. This effect is shown in 
Fig.3B. The abscissa gives the source numbers on patch 8 (449 to 512) and patch 9 (513 
to 576). The sources are arranged in equally spaced rows of eight, such that 449-456 and 
513-520 are just below the scalp, 457-464 and 521-528 are 0.328 cm from the scalp, and 
so on. Fig.3B shows that when ^89=0.55 cm (solid line), the ME procedure can resolve 
all sources (up to a maximum depth of 2.3 cm). This resolving power decreases with 
decreasing dgg. When ^89=0.0275 cm the ME procedure fails for sources at a depth of 2 
cm or greater (that is, sources 496-512 and 561-576 on patches 8 and 9, respectively). 
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FIGURE 4. A (left panel): Bmse and mse values for the case of varied activation on patch 8 (see text). 
B (right panel): Reconstructed (dash line) and (artificially generated) actual (black Une) (r) vs. current 
source number on patches 8 and 9. 

Varied activation on patch 8. We tested the ME procedure in a case with a slightly 
more complex activation pattern (still unknown to the prior): with ^89=0.55 cm, all 
current sources on patch 8 are activated, with those near the center of the patch having 
higher intensities than those in the peripheral. All other sources are silent. We used 
random source current densities as zeroth order prior, employed coarse graining twice. 



then used the standard ME to obtain the final reconstructed current intensities. With 
<i89=0.55 cm, the dependence of Bmse and mse on the iteration number is shown in 
Fig.4A. Interestingly for the ME procedure proper (crosses), only the Bmse improves 
with iteration, whereas the mse value remains a constant at about 18. This value is large 
compared to the value of 60±10 obtained for the case of uniform activation (Fig.2B). 
The solid and dashed lines in Fig.4B indicate the actual and reconstructed current 
intensities, respectively, for the sources on patches 8 and 9. These show that the poor 
result is caused by reconstructed false activation of sources 550 to 580 on patch 9. When 
sources other than those on patch 8 are forcibly forbidden to activate, mse increases to 

30 (bottom plot in Fig.4A), corresponding to a 22% error, suggesting that better results 
may be obtained when more reliable priors are given. This remains to be investigated. 
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